Abstract

Chlamydia trachomatis (CT) and Neisseria gonorrhoeae (GC) are the agents of two common, sexually transmitted 
diseases afflicting women in the United States (http://www.cdc.gov). We designed a novel web-based application 
that offers simple recommendations to help optimize medical outcomes with CT and GC prevention and control 
programs. This application takes population groups, prevalence rates, parameters for available screening assays and 
treatment regimens (costs, sensitivity, and specificity), as well as budget limits as inputs. Its output suggests optimal 
screening and treatment strategies for selected at-risk groups, commensurate with the clinic's budget allocation. 
Development of this tool illustrates how a clinical informatics application based on rigorous mathematics might 
have a significant impact on real-world clinical issues. 



Introduction and background 

Chlamydia trachomatis (CT) and Neisseria gonorrhoeae 
(GC) are the etiological agents of the two most commonly 
reported sexually transmitted diseases (STDs) among 
women in the United States. In 2011, 1,412,791 cases of 
sexually transmitted CT infection were reported to the 
Centers for Disease Control and Prevention (CDC) [1]. 
This case count corresponds to a rate of 457.6 cases per 
100,000 population, an increase of 8% over 2010. A common
co-infection with CT [2], GC infection accounted for
321,849 reported cases, corresponding to a rate of
104.2 cases per 100,000 population [1].

To control the spread of STDs, there are some screen- 
ing guidelines available to clinics. For example, the CDC 
recommends annual CT screening for sexually active 
adolescents and young women [3]. The U.S. Preventive 
Services Task Force (USPSTF) recommends screening all 
sexually active women, including those who are pregnant, 
for gonorrhea, if they are at increased risk for infection 
[4]. Recent data suggest that screening rates in young 
women are low, with most young women not getting
screened [5,6]. The CDC estimates that the incidence of 
CT is more than twice the number actually reported [7], 
at least partly because of low screening rates and the 
nature of CT infection, which is often asymptomatic. 
Perhaps another reason is that detection is typically
relegated to public clinics, which may have insufficient
budgets to screen all eligible women.

To make more efficient use of limited clinical
resources, mathematical resource allocation models have
been developed to calculate an optimal selection of
patient groups, screening assays, and
treatment regimens [5-7]. The parameters used in these
models typically come from published data [8]. However, 
they may be tailored to any particular demographic 
environment. Our goal, thus, has been to provide a rigor- 
ous mathematical framework, into which the end-user 
can insert specific parameters, adjusted to reflect local 
conditions and constraints. 

To achieve the goal, our approach employs three steps. 
First, we have designed a mathematical model as our the- 
oretical foundation to address both CT and GC infec- 
tions. Second, we have analyzed and interpreted the 
computational results of the proposed model. Finally, we 
have implemented the mathematical model as a web tool
that enables the local clinical manager to tailor
strategies to local conditions and resources.
Previously [8,9], we addressed the first two steps; here we 
elaborate the final stage. 

Method 

Mathematical formulation 

The proposed model is a nonlinear cubic binary model. 
We briefly introduce the model here; see our previous 
publications for details [8,9]. The patient population 
comprises m groups, with r available screening assays, s
available treatment regimens, and a funding limit b.
We define the following three decision variables:

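$$
X_i = \begin{cases} 1 & \text{if patient group } i \text{ is selected} \\ 0 & \text{otherwise} \end{cases}
$$

$$
Y_j = \begin{cases} 1 & \text{if screening assay } j \text{ is selected} \\ 0 & \text{otherwise} \end{cases}
$$

$$
Z_k = \begin{cases} 1 & \text{if treatment regimen } k \text{ is selected} \\ 0 & \text{otherwise} \end{cases}
$$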

The objective function is to maximize the likely rate of 
cured outcomes given the available screening assays and 
treatment regimens for given patient groups. 

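$$
\max \sum_{i,j,k} \mathrm{Pop}_i \cdot \mathrm{Cur}_{ijk} \cdot X_i Y_j Z_k := \sum_{i=1}^{m} \sum_{j=1}^{r} \sum_{k=1}^{s} \mathrm{Pop}_i \cdot \mathrm{Cur}_{ijk} \cdot X_i Y_j Z_k \qquad (1)
$$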

Subject to funding availability 

$$
\sum_{i,j,k} \mathrm{Pop}_i \cdot \mathrm{Cost}_{ijk} \cdot X_i Y_j Z_k \le b \qquad (2)
$$

where $\mathrm{Pop}_i$ represents the population of the i th group,
and $\mathrm{Cur}_{ijk}$ and $\mathrm{Cost}_{ijk}$ represent the
expected rate of cured infection cases and the costs,
respectively, over the population of the i th group
using the j th screening assay and treated with the
k th regimen. Assuming the same screening assay and the
same treatment are applied to all patients, we have:

$$
\sum_{j=1}^{r} Y_j = 1 \quad \text{and} \quad \sum_{k=1}^{s} Z_k = 1 \qquad (3)
$$

The solutions for the three decision variables give us an 
optimal strategy, maximizing the expected number of 
cured cases. This model is nonlinear and can be
converted into a knapsack problem (which is NP-hard)
[8,9]. There is no simple analytic solution to
this model [10]. Instead, we adopt a reasonably efficient,
two-step branch-and-bound algorithm that gives an exact
solution.
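
To make the structure of model (1)-(3) concrete, the following is a minimal Python sketch that solves a tiny instance by exhaustive enumeration. It is not the two-step branch-and-bound algorithm of [8,9]; the function name `solve_by_enumeration` and the toy numbers in `POP`, `CUR`, and `COST` are assumptions made purely for illustration.

```python
from itertools import product


def solve_by_enumeration(pop, cur, cost, budget):
    """Exhaustively solve model (1)-(3) for a small instance.

    pop[i]        : population of group i
    cur[i][j][k]  : expected cure rate for group i, assay j, regimen k
    cost[i][j][k] : expected per-person cost for group i, assay j, regimen k
    Returns (expected_cures, total_cost, selected_groups, assay_j, regimen_k).
    """
    m, r, s = len(pop), len(cur[0]), len(cur[0][0])
    best = (0.0, 0.0, (), None, None)
    # Constraint (3): exactly one assay j and one regimen k are chosen.
    for j, k in product(range(r), range(s)):
        # With (j, k) fixed, choosing X is a 0/1 knapsack over the groups.
        for x in product((0, 1), repeat=m):
            total_cost = sum(pop[i] * cost[i][j][k] * x[i] for i in range(m))
            if total_cost > budget:  # constraint (2)
                continue
            # Objective (1): expected number of cured cases.
            cures = sum(pop[i] * cur[i][j][k] * x[i] for i in range(m))
            if cures > best[0]:
                groups = tuple(i for i in range(m) if x[i])
                best = (cures, total_cost, groups, j, k)
    return best


# Hypothetical toy instance: 2 groups, 2 assays, 1 regimen.
POP = [100, 200]
CUR = [[[0.05], [0.08]], [[0.02], [0.03]]]
COST = [[[10], [20]], [[10], [20]]]

generous = solve_by_enumeration(POP, CUR, COST, budget=3000)
tight = solve_by_enumeration(POP, CUR, COST, budget=2500)
```

In this toy instance the larger budget favors screening both groups with the cheaper assay, while the tighter budget favors screening only the first group with the more expensive assay. Enumeration visits all 2^m group subsets for every (j, k) pair, so it is practical only for small m.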



Implementation overview 

Our implementation plan is to provide highly configur- 
able and user-friendly, web-oriented software that allows 
a clinical manager to specify parameters such as preva- 
lence rate, budget availability, and costs. Accepting these 
user-specified parameters, the tool aims to compute a 
detailed optimal strategy commensurate with that budget.
Additionally, users can explore several scenarios
by adding or deleting patient groups, screening assays, or
treatment regimens.

The application was developed using Java Enterprise
Edition (rendering it portable to Windows, Linux, Unix, or
Mac OS). It employs a Model-View-Controller (MVC)
architecture and object-relational (OR) mapping to reduce
the amount of code, and multi-thread programming to
speed up computation. Dynamic HTML (DHTML) is used
extensively so that users can easily configure any
combination of population, screening assay, and treatment
regimen parameters. Application-wide data are stored in
MySQL, which holds the user and superuser information;
only superusers can change the data structure. The same
database also stores the parameters used to calculate the
optimal strategy. For example, the parameters of each
screening assay and treatment regimen (e.g., sensitivity,
specificity, and unit costs), taken from published data,
are saved as default reference values. They are loaded
automatically when a user first logs in to the tool. The
tool also allows users to override these inputs, since a
local site may have its own scenario (e.g., higher or
lower assay costs than the defaults).
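
The default-plus-override behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `AssayParams`, the function `effective_params`, the key `assay_a`, and all numeric values are hypothetical placeholders, not the tool's actual schema or the published defaults.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AssayParams:
    name: str
    sensitivity: float
    specificity: float
    unit_cost: float


# Placeholder values for illustration only; not the published defaults.
DEFAULT_ASSAYS = {
    "assay_a": AssayParams("assay_a", sensitivity=0.90,
                           specificity=0.99, unit_cost=10.0),
}


def effective_params(defaults, overrides):
    """Return a copy of the defaults with per-assay field overrides applied."""
    merged = {}
    for key, params in defaults.items():
        # dataclasses.replace builds a new object, so defaults stay untouched.
        merged[key] = replace(params, **overrides.get(key, {}))
    return merged


# A local site reports a higher unit cost than the default.
local = effective_params(DEFAULT_ASSAYS, {"assay_a": {"unit_cost": 12.5}})
```

Keeping the defaults immutable and building a merged copy per session means one user's overrides cannot leak into the reference values seen by others.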

Architecture and major modules 

Our application adopts a multi-layer MVC architecture,
shown in Figure 1a. From left to right, the layers are the
Web Explorer layer, the Web Server layer, and the MySQL
Database layer. The Web Explorer layer includes the web
pages (representing the View) that users use to send
service requests and receive service responses. The Web
Server layer handles all the business logic needed to
process users' requests. This layer also contains a
controller component, which accesses application data in
the MySQL database (not shown in Figure 1). The processed
result is stored in model components (JavaBeans) and
routed back to the controller component, which constructs
the result page.

The business logic can be categorized into two major
modules. The data management module analyzes various
application-wide data, such as user information,
transaction data, population data, and screening and
treatment data. It also identifies users as anonymous or
advanced-users, provides help, and retrieves the default
settings. The population, screening and treatment module
is used to customize population data by adding or
removing a population group. This module then
automatically changes the structure of the parameter
input tables for the infection and/or co-infection rates
reflected in the population data. It is also used to
customize screening and/or treatment choices.

Figure 1 (a) Multi-layer MVC architecture, (b) The welcome page. Two types of users are required.

Results 

To achieve the goal of calculating an optimal strategy
automatically, the mathematical model was implemented
in the new web-based tool (Figure 1b and Figure 2). The
web tool has five main pages: population groups,
infection rates, co-infection rates, screening setting,
and treatment setting. Help windows with tutorial
information are also available if a clinical manager has
difficulties. The pages follow each other in sequence as
a clinical manager submits his or her local parameters.
For example, the decision variable $X_i$ is configured on
the "population group" page, where visiting patients are
initially divided into 12 groups reflecting different
populations at local clinics (Figure 2c). The
corresponding local prevalence rates can be specified on
the "infection rate" page (Figure 2a). The "co-infection
rate" page specifies how likely it is that CT patients in
the population also have GC. The other two variables,
$Y_j$ and $Z_k$, control the decisions on screening assays
and treatment regimens and are specified on the
"screening setting" page (Figure 3) and the "treatment
setting" page (Figure 2b), respectively.

After the required parameters for the model are
specified, the web tool calculates the optimal strategy
by solving the proposed mathematical model with the
accurate, two-step branch-and-bound algorithm. An example
of a calculated optimal solution is shown in Figure 2d.
It is interpreted as follows: given the local situation
specified for a clinic, the optimal solution recommends
screening the black groups 20 or younger and 24 or older
using BD ProbeTec CT, and treating those with positive
screening results with doxycycline. The tool also reports
that, given a pre-determined budget of $50,000 (default
costs in Figure 2d), the plan suggested by the model can
be expected to cure 96 patients given the local CT and GC
prevalence rates. Furthermore, the tool suggests that
$46,370 (revised costs in Figure 2d) should be sufficient
to achieve the expected cures.

Figure 2 User interfaces and the parameters that need to be specified.

The other part of the goal is to allow clinical experts to
re-design the decision model. Several steps are needed to
achieve this. First, a login page was designed to classify
users as "anonymous" or "advanced". Anonymous users are
not required to have passwords to use the tool, and they
can access the basic functions needed to calculate the
optimal solutions as described above (Figure 2).
To re-design the decision model, clinical professionals
have to be authenticated as advanced-users. An
advanced-user can add or delete population groups,
screening assays, and treatment regimens. The numbers of
underlying decision variables $X_i$, $Y_j$, and $Z_k$ are
updated correspondingly.

Figure 3 User may tailor his/her model by adding and deleting the screening assays.

For example, Figure 3 illustrates a feature available to
advanced-users, namely the addition of a new screening
assay, including whether the assay is for CT, GC, or
both (this is accomplished in a pop-up window). After
the advanced-user has added a new screening assay to
the model, the tool re-calculates the model, taking the
additional diagnostic assay into consideration by
augmenting the terms of decision variable $Y_j$.
Analogously, adding or deleting population groups and
treatment regimens leads to a re-calculation with respect
to changes in the parameters related to $X_i$ and $Z_k$
(the interface webpage is not shown). The "re-design"
features give advanced-users the flexibility to re-model
new situations and to tailor the computation efficiently
to their specific situations.
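
As a rough illustration of what adding an assay means for the search space, the sketch below enumerates the (assay, regimen) pairs permitted by constraint (3) before and after one assay is added. The helper `strategy_pairs` and the names `assay_B`, `regimen_B`, and `new_assay` are hypothetical; only BD ProbeTec CT and doxycycline appear in the text.

```python
from itertools import product


def strategy_pairs(assays, regimens):
    """All (assay, regimen) pairs allowed by constraint (3)."""
    return list(product(assays, regimens))


assays = ["BD ProbeTec CT", "assay_B"]      # "assay_B" is hypothetical
regimens = ["doxycycline", "regimen_B"]     # "regimen_B" is hypothetical
before = strategy_pairs(assays, regimens)   # r * s pairs

assays.append("new_assay")                  # an advanced-user adds an assay
after = strategy_pairs(assays, regimens)    # (r + 1) * s pairs
```

Each added assay contributes s new pairs, and each new pair requires one more knapsack over the population groups in the re-calculation.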

Discussion 

Many efforts to improve the quality of
health care have resulted in the development of clinical
decision-support systems [11-13]. However, clinical
practitioners seem prone to rely on their own experience
to solve problems instead of using decision aids
[11,14]. This barrier is due in part to the fact that
practice guidelines (typically promulgated by
organizations like the CDC) do not fit local clinical
situations; this is certainly true in the case of
sexually transmitted disease control programs.

Unlike other clinical decision-support systems
[11,15,16], this new web-based tool is designed to lower
that barrier by enabling practical-minded clinical
managers to impose their view of local realities and
still avail themselves of a rigorous mathematical model
for the number-crunching. This is accomplished without
compromising ease of use, thanks to its user-friendly
interfaces and didactic instructions for adding or
deleting population groups, screening assays, or
treatment regimens. This design not only allows users
to do "what-if" analysis by manipulating the mathematical
model with their own parameters, but also gives
accomplished users the flexibility to re-parameterize the
model virtually from scratch. To our knowledge, this is
the first web-based tool (which utilizes a rigorous
mathematical model) to offer a detailed, optimal strategy
to select at-risk patient groups, as well as screening
assays and treatment regimens for the control and
prevention of CT and GC, all within a specified budgetary
constraint.

Of course, there are limitations to the approach and
challenges in its implementation. First, the new decision
tool depends on the underlying mathematical model, 
which embodies necessary assumptions. Though para- 
meters can be adjusted, the underlying assumptions are 
fixed. Second, there is a theoretical computational limit
in solving the model. For example, the two-step
branch-and-bound algorithm for the model has an overall
running time of $O(n \cdot m \cdot 2^m)$, where n is the
number of combinations of screening and treatment
strategies satisfying conditions (3), and m is the number
of population groups [8]. As the number of population
groups, the choices of screening assays, and the
availability of treatment regimens increase, the
computational challenge increases. We are optimistic about
overcoming the computational challenge for the following
reasons. The values of m and n are not huge in practice.
The availability of regimens determines the value
of n. There are usually practical guidelines at each
clinic regarding how to partition patients into m groups.
Commercial software applications may adopt approximation
algorithms for solving the proposed model, too. For
example, Excel Solver's approximation algorithm sometimes
calculates near-optimal solutions, while the
two-step branch-and-bound algorithm is an exact algorithm
that always calculates the optimal solution. We
demonstrated the advantage of the two-step
branch-and-bound algorithm over Excel Solver's
approximation algorithm in terms of computational accuracy
and running time [8]. A third challenge is to provide
a reasonably quick service response, an important factor
for users expecting a timely browsing experience. To
overcome this obstacle, we have designed a dedicated
logic handler on the basis of a multi-thread programming
technique. We provide detail on threading techniques in
the Appendix. The computation time of the algorithm
thus becomes a matter of seconds [8]. Multi-threading and
careful algorithm design thus meet the computational
challenge and satisfy users' expectation of a quick
response. Fourth, we are aware that the new application
needs to be tested and evaluated by clinical managers so
that it can be improved, both with respect to its user
interfaces and its back-end algorithm. We are actively
seeking collaboration with clinicians to evaluate this
tool. A short follow-up report of actual use will be
prepared once beta testing within a clinic is complete,
and cross-site evaluations could also be reported after
we receive feedback from more clinics.
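
To give a feel for how the worst-case bound grows, the short Python sketch below evaluates it for the tool's initial 12 population groups. It assumes the running-time bound quoted above reads n · m · 2^m with constant factors ignored; the function name `bnb_worst_case_steps` and the choice n = 6 are hypothetical.

```python
def bnb_worst_case_steps(n, m):
    """Evaluate the worst-case bound n * m * 2**m (constant factors ignored)."""
    return n * m * 2 ** m


# m = 12 matches the tool's initial number of population groups;
# n = 6 strategy combinations is a hypothetical value.
steps_12_groups = bnb_worst_case_steps(n=6, m=12)
steps_20_groups = bnb_worst_case_steps(n=6, m=20)
growth = steps_20_groups / steps_12_groups
```

Under these assumptions, 12 groups give fewer than 300,000 elementary steps, while 20 groups give over 125 million. The exponential term in m dominates, which is why the practical limit on the number of population groups matters more than the number of assays or regimens.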

Hopefully, with cooperative interaction between clinicians
and mathematicians, these limitations can be ameliorated,
resulting in an improved tool. We are
optimistic that successful implementation of this tool 
will highlight the feasibility of applying complicated 
mathematical models to practical clinical problems via a 
powerful informatics approach. 

Appendix 

The algorithm of the logic handler is sketched in
Table 1 (for the master thread) and Table 2 (for the
slave threads).

Note that in Table 1, after all combinations are cre- 
ated, they are categorized into different types. For 
example, one of the types is ct-single-screening-single- 
treatment, which stands for the combination of screen- 
ing and treatment plan for CT where both the screen- 
ing and treatment plan are a single plan (only used for 
screening/treating a single disease). After all types are
created, they are distributed to slave threads (each
type is processed by one slave thread) to calculate
the number of cured people among the population
groups. The processing results are fed back to the
master thread. The algorithm for slave threads is
described in Table 2.

After calculating the number of people expected to be
cured, as well as the associated cost of the given type
of combinations, the optimal results are obtained by
solving several "knapsack" problems. For insight on how
to convert this mathematical model into knapsack
problems and the details of the two-step branch-and-bound
algorithm, please refer to our previous publication [8].



Table 1 The algorithm for the master thread 

Procedure master(groups, screenings, treatments)
    {screeningsCT, screeningsGC} = identify the list of screening plans for CT and GC, respectively.
    {treatmentsCT, treatmentsGC} = identify the list of treatment plans for CT and GC, respectively.
    FOR I = 1 TO screeningsCT_size
        FOR J = 1 TO screeningsGC_size
            FOR K = 1 TO treatmentsCT_size
                FOR L = 1 TO treatmentsGC_size
                    Create a combination of screeningsCT[I], screeningsGC[J], treatmentsCT[K], and treatmentsGC[L].
    Categorize all combinations into different types.
    Distribute each type of combinations into a slave thread.
    Wait for slave threads to finish the computation.
    Collect all results and calculate the final optimal result.
    Return the optimal result to logic handler.
End procedure master.
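
The master procedure in Table 1 can be sketched in Python with a thread pool. This is a minimal illustration, not the tool's Java implementation: the callbacks `categorize` and `slave` are hypothetical parameters, and the sketch assumes each slave returns a tuple whose first element is the expected number of cures.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def master(screenings_ct, screenings_gc, treatments_ct, treatments_gc,
           categorize, slave):
    """Sketch of Table 1: build combinations, group by type, fan out.

    categorize(combo) -> hashable type label (hypothetical callback)
    slave(combos)     -> tuple whose first element is the expected cures
                         of the local optimum for one type (hypothetical)
    """
    # The four nested FOR loops of Table 1.
    combos = product(screenings_ct, screenings_gc,
                     treatments_ct, treatments_gc)

    # Categorize all combinations into different types.
    by_type = {}
    for combo in combos:
        by_type.setdefault(categorize(combo), []).append(combo)

    # One task per type; leaving the 'with' block waits for all of them.
    with ThreadPoolExecutor(max_workers=max(1, len(by_type))) as pool:
        local_optima = list(pool.map(slave, by_type.values()))

    # Final optimum: the local optimum with the most expected cures.
    return max(local_optima, key=lambda result: result[0], default=None)
```

In CPython, threads reproduce the control flow of Table 1 but do not by themselves speed up CPU-bound work; `concurrent.futures.ProcessPoolExecutor` offers the same interface if parallel speed-up is the goal.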
Table 2 The algorithm for slave threads

Procedure slave(groups, screenings, treatments, combinations, budget)
    FOR I = 1 TO combinations_size
        Get screening1, screening2 from combinations[I] and screenings;
        Get treatment1, treatment2 from combinations[I] and treatments;
        FOR J = 1 TO groups_size
            Update combinations[I] by adding the number of cured people in groups[J] given screening1, screening2, treatment1 and treatment2.
            Update combinations[I] by adding the cost for curing people in groups[J] given screening1, screening2, treatment1 and treatment2.
    Run knapsack algorithm to get the local optimal results for this type of combinations under budget.
    Return the local optimal results to the master thread.
End procedure slave.
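
One possible reading of Table 2 is sketched below in Python: for each combination, per-group cures and costs are computed, a 0/1 knapsack over the groups is solved under the budget, and the best combination is kept. The knapsack here is a plain depth-first branch and bound with a simple optimistic bound, not the two-step algorithm of [8]; the callbacks `cured` and `cost`, like all other names, are hypothetical.

```python
def knapsack_bnb(values, weights, budget):
    """0/1 knapsack by depth-first branch and bound.

    Returns (best_value, chosen_indices). Assumes non-negative values.
    """
    n = len(values)
    # suffix[i] = total value of items i..n-1, an optimistic bound.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + values[i]

    best = [0.0, ()]

    def visit(i, value, weight, chosen):
        if value > best[0]:
            best[0], best[1] = value, chosen
        if i == n or value + suffix[i] <= best[0]:
            return  # leaf reached, or branch cannot beat the incumbent
        if weight + weights[i] <= budget:
            visit(i + 1, value + values[i], weight + weights[i],
                  chosen + (i,))
        visit(i + 1, value, weight, chosen)

    visit(0, 0.0, 0.0, ())
    return best[0], best[1]


def slave(groups, combinations, budget, cured, cost):
    """Sketch of Table 2 for one type of combinations.

    cured(group, combo) and cost(group, combo) are hypothetical callbacks
    giving expected cures and cost for one group under one combination.
    Returns (cures, total_cost, combo, chosen_group_indices).
    """
    best = (0.0, 0.0, None, ())
    for combo in combinations:
        values = [cured(g, combo) for g in groups]
        weights = [cost(g, combo) for g in groups]
        total, chosen = knapsack_bnb(values, weights, budget)
        if total > best[0]:
            spent = sum(weights[i] for i in chosen)
            best = (total, spent, combo, chosen)
    return best
```

The returned tuple starts with the expected cures, so a master routine can compare local optima from different slave threads on that first element.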